Towards the Processing of Historic Documents

Authors

  • Björn Gottfried
  • Lothar Meyer-Lerbs
Abstract

This chapter describes methods required for transforming complex document images into text. The goal is to make the contents of documents that are not born-digital, but have been converted from a physical medium to a digital format, available to search engines. Established optical character recognition methods fail for documents about which no assumptions can be made regarding the symbols they contain, which may well be unknown; historic documents are the example domain par excellence. This chapter, however, has a much broader goal: it outlines fundamental problems, as well as a methodology for dealing with documents containing unknown and arbitrary symbols, in order to provide a basis for discussion and future work within the digital library community. In particular, future advances will require closer interaction among researchers concerned with such diverse topics as document digitisation, reproduction, and preservation, as well as search engines, cross-language processing, mobile libraries, and many further areas. By adopting a general view of the issues presented, the chapter is intended to sensitise researchers in these areas to the problems met in processing complex, especially historic, documents.


Similar resources

Presentation of Economic Regeneration Model in Historic Fabric Based on Order in Structural Functionalism Theory

Historic fabric can play an important role in the development of cities. Sustainable urban regeneration is one of the recent approaches to historic fabric. In this approach, all indicators of sustainable development, including economic, social, cultural, management and environmental dimensions, have been used in the conservation of the historic fabric. All the principles of sustainable development ...


Re-reading the Immovable Inscription Documents in the World Heritage Site of the Tabriz Historic Bazaar Complex

Immovable inscriptions are considered among the most important works and historical documents in the cultural assets of our country. They were installed on selected parts of historical buildings and outstanding monuments and were always noticeable. The role of inscriptions as the basic and effective tools is important in terms of manifesting and implication of educational and ed...


Natural environment of Zayande-Rood and the Safavid development of Isfahan

Isfahan is a historic city that has experienced several urban developments throughout its shining and glorious past. They began in Al-buyid and Seljuq periods, and continued through the Safavid urban evolution in the sixteenth century. Zayande-Rood is an important and effective natural element in the city's landscape and plan. This article reflects the conclusion of a historic study on revit...


Typology of Tarmeh in the Historic city of Bushehr

In the Architecture of Iran, the vernacular architecture of Bushehr’s historic city is distinctive since it is both introverted and extroverted and has different semi-open interior and exterior spaces. Tarameh is one of these semi-open spaces. Using a qualitative research method, conducting desk and field studies and reviewing the technical documents of the historic buildings of Bushehr, this p...


A Cross-Language Approach to Historic Document Retrieval

Our cultural heritage, as preserved in libraries, archives and museums, is made up of documents written many centuries ago. Large-scale digitization initiatives make these documents available to non-expert users through digital libraries and vertical search engines. For a user, querying a historic document collection may be a disappointing experience: queries involving modern words may not be ver...




Publication date: 2009